Abstract: The decoding of Low-Density Parity-Check codes
by the Belief Propagation (BP) algorithm is revisited. We
check the iterative algorithm for its convergence to a codeword
(termination) and run Monte Carlo simulations to find the
probability distribution function of the termination time, $n_{\mathrm{it}}$.
Tested on an example [155, 64, 20] code, this termination curve
shows a maximum and an extended algebraic tail at the highest
values of $n_{\mathrm{it}}$. Aiming to reduce the tail of the termination
curve, we consider a family of iterative algorithms modifying the
standard BP by means of a simple relaxation. The relaxation
parameter controls the convergence of the modified BP algorithm
to a minimum of the Bethe free energy. The improvement
is experimentally demonstrated for the Additive-White-Gaussian-Noise
channel in some range of signal-to-noise ratios. We
also discuss the trade-off between the relaxation parameter of
the improved iterative scheme and the number of iterations.

Low-Density Parity-Check (LDPC) codes [1], [2] are the
best linear block error-correction codes known today [3].
In addition to being good codes, i.e. capable of decoding
without errors in the thermodynamic limit of an infinitely
long block length, these codes can also be decoded efficiently.
The main idea of Belief Propagation (BP) decoding
is to approximate the actual graphical model, formulated
for solving the statistical inference Maximum Likelihood (ML)
or Maximum-A-Posteriori (MAP) problems, by a tree-like
structure without loops. Being efficient but suboptimal, the
BP algorithm fails on certain configurations of the channel
noise for which close-to-optimal (but inefficient) MAP decoding
would be successful.

BP decoding allows a certain duality in interpretation.
First of all, and following the so-called Bethe free energy
variational approach [4], BP can be understood as a set of
equations for beliefs (BP equations) solving a constrained
minimization problem. On the other hand, a more traditional
approach is to interpret BP in terms of an iterative procedure,
the so-called BP iterative algorithm [1], [5], [2]. Being identical
on a tree (as then the BP equations are solved explicitly by
iterations from the leaves to the tree center), the two approaches
are however distinct for a graphical problem with loops. In
case of their convergence, BP algorithms find a minimum of
the Bethe free energy [4], [6], [7]; however, in the general case
convergence of the standard iterative BP is not guaranteed.
It is also understood that BP fails to converge primarily due
to the circling of messages in the process of iterations over the
loopy graph.

To enforce convergence of the iterative algorithm to a
minimum of the Bethe free energy, a number of modifications
of the standard iterative BP have been discussed in recent
years. The tree-based re-parametrization framework of [8]
suggests limiting communication on the loopy graph, cutting
some edges in a dynamical fashion so that the undesirable
effects of cycles are suppressed. Another approach, the so-called
concave-convex procedure introduced in [9] and generalized in [10],
suggests decomposing the Bethe free energy into concave
and convex parts, thus splitting the iterations into two sequential
sub-steps.

Noticing that convergence of the standard BP fails mainly
due to overshooting of iterations, we develop in this paper
a tunable relaxation (damping) that cures the problem.
Compared with the aforementioned alternative methods, this
approach can be practically more advantageous due to its
simplicity and tunability. In its simplest form our modification
of the BP iterative procedure is given by Eq. (1),
where Latin and Greek indexes stand for bits and checks and
the bit/check relations, e.g. $i \in \alpha$, $\alpha \ni i$, express the LDPC
code considered; $h_i$ is the channel noise-dependent value of
log-likelihoods; and $\eta_{i\alpha}^{(n)}$ is the message associated at the $n$-th
iteration with the edge (of the respective Tanner graph)
connecting the $i$-th bit and the $\alpha$-th check. $\Delta$ is a tunable parameter.
By choosing a sufficiently small $\Delta$ one can guarantee
convergence of the iterative procedure to a minimum of the
Bethe free energy. On the other hand, $\Delta = +\infty$ corresponds
exactly to the standard iterative BP. In the sequel we derive
and explain the modified iterative procedure (1) in detail.
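
Written out according to the discretization rules stated in Section III (left-hand side taken at step $n+1$, right-hand side at step $n$, relaxation coefficient rescaled to one, time step $\Delta$), the relaxed update can be sketched as follows. This is a reconstruction from those rules, not a quoted formula; the shorthands $G_{i\alpha}$ and $S_i = \sum_{\beta\ni i}\eta_{i\beta}$, and the reading of $q_i$ as the number of checks containing bit $i$, are notation assumed here for illustration.

```latex
\begin{align*}
\eta_{i\alpha}^{(n+1)} + \frac{1}{\Delta}\sum_{\beta\ni i}
   \bigl(\eta_{i\beta}^{(n+1)}-\eta_{i\beta}^{(n)}\bigr) &= G_{i\alpha}^{(n)},
\qquad
G_{i\alpha}^{(n)} \equiv h_i + \sum_{\beta\ni i,\ \beta\neq\alpha}
   \tanh^{-1}\Bigl(\prod_{j\in\beta,\ j\neq i}\tanh\eta_{j\beta}^{(n)}\Bigr),\\
% sum the first line over the q_i checks alpha that contain bit i
\Bigl(1+\frac{q_i}{\Delta}\Bigr)S_i^{(n+1)} &= \sum_{\alpha\ni i}G_{i\alpha}^{(n)}
   + \frac{q_i}{\Delta}\,S_i^{(n)},\\
% substitute S_i^{(n+1)} back into the first line
\eta_{i\alpha}^{(n+1)} &= G_{i\alpha}^{(n)}
   + \frac{S_i^{(n)}-\sum_{\beta\ni i}G_{i\beta}^{(n)}}{\Delta+q_i}.
\end{align*}
```

In this explicit form the limit $\Delta \to +\infty$ reduces to $\eta_{i\alpha}^{(n+1)} = G_{i\alpha}^{(n)}$, the standard iteration, and for any $\Delta > 0$ a fixed point satisfies $\eta_{i\alpha} = G_{i\alpha}$, so the relaxation changes the dynamics but not the fixed points.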

The manuscript is organized as follows. We introduce
the Bethe free energy, the BP equation and the standard
iterative BP in Section I. Performance of the standard iterative
BP, analyzed with a termination curve, is discussed in Section II.
Section III describes continuous and sequentially discrete
(iterative) versions of our relaxation method. We discuss
performance of the modified iterative scheme in Section IV,
where the Bit-Error-Rate and the termination curve for an
LDPC code operating over the Additive-White-Gaussian-Noise
(AWGN) channel are discussed for a range of interesting
values of the Signal-to-Noise-Ratio (SNR). We also discuss
there the trade-off between convergence and the number of
iterations, aiming to find an optimal strategy for selection of
the model's parameters. The last Section V is reserved for
conclusions and discussions.

I. Bethe Free Energy and Belief Propagation 

Consider a generic factor model [11], [12], [13] with a
binary configurational space, $\sigma_i = \pm 1$, $i = 1, \dots, n$, which
is factorized so that the probability $p\{\sigma_i\}$ to find the system
in the state $\{\sigma_i\}$ and the partition function $Z$ are

\[ p\{\sigma_i\} = Z^{-1} \prod_\alpha f_\alpha(\sigma_\alpha), \qquad Z = \sum_{\{\sigma_i\}} \prod_\alpha f_\alpha(\sigma_\alpha), \tag{2} \]

where $\alpha$ labels non-negative and finite factor-functions $f_\alpha$
with $\alpha = 1, \dots, m$ and $\sigma_\alpha$ represents a subset of $\sigma_i$
variables. Relations between factor functions (checks) and
elementary discrete variables (bits), expressed as $i \in \alpha$ and
$\alpha \ni i$, can be conveniently represented in terms of the
system-specific factor (Tanner) graph. If $i \in \alpha$ we say that
the bit and the check are neighbors. Any spin (a-posteriori
log-likelihood) correlation function can be calculated using
the partition function, $Z$, defined by Eq. (2). The general expression
for the factor functions of an LDPC code is

\[ f_\alpha(\sigma_\alpha) = \exp\Bigl( \sum_{i \in \alpha} h_i \sigma_i / q_i \Bigr) \, \delta\Bigl( \prod_{i \in \alpha} \sigma_i ,\, 1 \Bigr), \tag{3} \]

where $q_i$ denotes the number of checks neighboring bit $i$.
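
As a concrete check of Eqs. (2) and (3), the partition function of a very small code can be summed by brute force. The sketch below is illustrative only: the 5-bit, 2-check code, the values in `h`, and all function names are made up for this example, and $q_i$ is taken to be the number of checks containing bit $i$.

```python
import itertools
import math

# Hypothetical toy code (not from the paper): 5 bits, 2 parity checks.
CHECKS = [(0, 1, 2), (2, 3, 4)]
N_BITS = 5
# Q[i]: number of checks that contain bit i (assumed meaning of q_i).
Q = [sum(i in c for c in CHECKS) for i in range(N_BITS)]


def factor(check, sigma, h):
    """f_alpha of Eq. (3): exp(sum_i h_i*sigma_i/q_i) if the parity holds, else 0."""
    if math.prod(sigma[i] for i in check) != 1:
        return 0.0
    return math.exp(sum(h[i] * sigma[i] / Q[i] for i in check))


def weight(sigma, h):
    """Unnormalized probability of a configuration: product of all factors, Eq. (2)."""
    return math.prod(factor(c, sigma, h) for c in CHECKS)


def partition_function(h):
    """Z of Eq. (2), summed over all 2^n spin configurations."""
    return sum(weight(s, h) for s in itertools.product((1, -1), repeat=N_BITS))


h = [0.8, -0.3, 0.5, 1.1, 0.2]  # illustrative log-likelihood values
Z = partition_function(h)
codewords = [s for s in itertools.product((1, -1), repeat=N_BITS) if weight(s, h) > 0]
```

Only the configurations satisfying both parity checks carry nonzero weight, and for those the product of factors reduces to $\exp(\sum_i h_i \sigma_i)$, because each term $h_i \sigma_i / q_i$ appears once per neighboring check.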



Let us now reproduce the derivation of the Belief Propagation
equation based on the Bethe free energy variational
principle, following closely the description of [4]. (See also
the Appendix of [16].) In this approach trial probability
distributions, called beliefs, are introduced both for bits
and checks, $b_i$ and $b_\alpha$ respectively, where $i = 1, \dots, N$,
$\alpha = 1, \dots, M$. A belief is defined for a given configuration
of the binary variables over the code. Thus, a belief at a bit
actually consists of two probabilities, $b_i(+)$ and $b_i(-)$, and
we use the natural notation $b_i(\sigma_i)$. There are $2^k$ beliefs defined
at a check, $k$ being the number of bits connected to the check,
and we introduce the vector notation $\sigma_\alpha = (\sigma_{i_1}, \dots, \sigma_{i_k})$, where
$i_1, \dots, i_k \in \alpha$ and $\sigma_i = \pm 1$. Beliefs satisfy the following
inequality constraints

\[ 0 \le b_i(\sigma_i),\ b_\alpha(\sigma_\alpha) \le 1, \tag{4} \]

the normalization constraints

\[ \sum_{\sigma_i} b_i(\sigma_i) = \sum_{\sigma_\alpha} b_\alpha(\sigma_\alpha) = 1, \tag{5} \]

as well as the consistency (between bits and checks) constraints

\[ \sum_{\sigma_\alpha \setminus \sigma_i} b_\alpha(\sigma_\alpha) = b_i(\sigma_i), \tag{6} \]

where $\sigma_\alpha \setminus \sigma_i$ stands for the set of $\sigma_j$ with $j \in \alpha$, $j \ne i$.



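The Bethe free energy is defined as the difference of the
Bethe self-energy and the Bethe entropy,

\[ F_{\mathrm{Bethe}} = U_{\mathrm{Bethe}} - H_{\mathrm{Bethe}}, \tag{7} \]

\[ U_{\mathrm{Bethe}} = -\sum_\alpha \sum_{\sigma_\alpha} b_\alpha(\sigma_\alpha) \ln f_\alpha(\sigma_\alpha), \tag{8} \]

\[ H_{\mathrm{Bethe}} = -\sum_\alpha \sum_{\sigma_\alpha} b_\alpha(\sigma_\alpha) \ln b_\alpha(\sigma_\alpha)
   + \sum_i (q_i - 1) \sum_{\sigma_i} b_i(\sigma_i) \ln b_i(\sigma_i), \tag{9} \]

where $\sigma_\alpha = (\sigma_{i_1}, \dots, \sigma_{i_k})$, $i_1, \dots, i_k \in \alpha$ and $\sigma_i = \pm 1$.
The entropy term for a bit enters Eq. (9) with the coefficient
$1 - q_i$ to account for the right counting of the number of
configurations for a bit: all entries for a bit (e.g. through the
check term) should give $+1$ in total.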
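
The role of the $1 - q_i$ coefficient can be checked numerically on a graph without loops, where the Bethe free energy evaluated at the exact marginals equals $-\ln Z$. The sketch below assumes a made-up tree-like toy code (two checks sharing one bit), illustrative values of `h`, the standard definition of the self-energy, and $q_i$ equal to the number of checks containing bit $i$; none of these names or numbers come from the text.

```python
import itertools
import math

CHECKS = [(0, 1, 2), (2, 3, 4)]  # hypothetical toy code whose factor graph is a tree
N_BITS = 5
Q = [sum(i in c for c in CHECKS) for i in range(N_BITS)]
h = [0.8, -0.3, 0.5, 1.1, 0.2]   # illustrative values


def f(check, sub):
    """Factor of Eq. (3) for the local configuration `sub` (spins on `check`)."""
    if math.prod(sub) != 1:
        return 0.0
    return math.exp(sum(h[i] * s / Q[i] for i, s in zip(check, sub)))


states = list(itertools.product((1, -1), repeat=N_BITS))
w = {s: math.prod(f(c, tuple(s[i] for i in c)) for c in CHECKS) for s in states}
Z = sum(w.values())

# Exact marginals used as beliefs; they satisfy constraints (4)-(6) by construction.
b_bit = [{x: sum(p for s, p in w.items() if s[i] == x) / Z for x in (1, -1)}
         for i in range(N_BITS)]
b_chk = []
for c in CHECKS:
    b = {}
    for s, p in w.items():
        key = tuple(s[i] for i in c)
        b[key] = b.get(key, 0.0) + p / Z
    b_chk.append(b)


def xlogy(x, y):
    """x*ln(y) with the convention 0*ln(0) = 0."""
    return 0.0 if x == 0.0 else x * math.log(y)


# Self-energy, entropy and free energy as in Eqs. (7)-(9).
U = -sum(xlogy(b, f(c, k)) for c, bc in zip(CHECKS, b_chk) for k, b in bc.items())
H = (-sum(xlogy(b, b) for bc in b_chk for b in bc.values())
     + sum((Q[i] - 1) * xlogy(b, b) for i in range(N_BITS) for b in b_bit[i].values()))
F_bethe = U - H
```

Here only the shared bit has $q_i = 2$, so its entropy is subtracted once, which removes the double counting from the two check terms. On a graph with loops the same expression is only an approximation to $-\ln Z$.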

Optimal configurations of beliefs are the ones that minimize
the Bethe free energy (7) subject to the constraints
(4), (5), (6). Introducing these constraints into the effective
Lagrangian through Lagrange multiplier terms



(10) 



and looking for the extremum with respect to all possible 
beliefs leads to 
\[ b_\alpha(\sigma_\alpha) = f_\alpha(\sigma_\alpha) \exp\bigl[\,\cdots\bigr], \tag{11} \]

\[ b_i(\sigma_i) = \exp\bigl[\,\cdots\bigr]. \tag{12} \]
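
For orientation, one standard way of writing such a Lagrangian and its stationarity conditions is sketched below. The symbol $L$ and the multiplier names $\gamma_\alpha$, $\gamma_i$ are assumptions introduced here (only $\lambda_{i\alpha}(\sigma_i)$ appears in the surrounding text), and the sign and normalization conventions may differ from those of Eqs. (10)-(12).

```latex
\begin{align*}
L &= F_{\mathrm{Bethe}}
   + \sum_\alpha \gamma_\alpha\Bigl(\sum_{\sigma_\alpha} b_\alpha(\sigma_\alpha)-1\Bigr)
   + \sum_i \gamma_i\Bigl(\sum_{\sigma_i} b_i(\sigma_i)-1\Bigr)\\
  &\quad + \sum_i\sum_{\alpha\ni i}\sum_{\sigma_i}\lambda_{i\alpha}(\sigma_i)
     \Bigl(b_i(\sigma_i)-\sum_{\sigma_\alpha\setminus\sigma_i} b_\alpha(\sigma_\alpha)\Bigr),\\
\frac{\partial L}{\partial b_\alpha(\sigma_\alpha)}=0 &\;\Rightarrow\;
  b_\alpha(\sigma_\alpha)=f_\alpha(\sigma_\alpha)
  \exp\Bigl(-\gamma_\alpha-1+\sum_{i\in\alpha}\lambda_{i\alpha}(\sigma_i)\Bigr),\\
\frac{\partial L}{\partial b_i(\sigma_i)}=0 &\;\Rightarrow\;
  b_i(\sigma_i)=\exp\Bigl(\frac{\gamma_i+\sum_{\alpha\ni i}\lambda_{i\alpha}(\sigma_i)}{q_i-1}-1\Bigr).
\end{align*}
```

Both conditions follow from $\partial (b \ln b)/\partial b = \ln b + 1$ applied to Eqs. (7)-(9); the bit condition is meaningful for $q_i > 1$.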
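Substituting $\lambda_{i\alpha}(\sigma_i) = \ln \prod_{\beta \ni i}^{\beta \ne \alpha} \mu_{i\beta}(\sigma_i)$ into Eqs. (11), (12)
we arrive at

\[ b_\alpha(\sigma_\alpha) \propto f_\alpha(\sigma_\alpha) \prod_{i \in \alpha} \prod_{\beta \ni i}^{\beta \ne \alpha} \mu_{i\beta}(\sigma_i)
   = f_\alpha(\sigma_\alpha) \prod_{i \in \alpha} \exp\bigl(\lambda_{i\alpha}(\sigma_i)\bigr), \tag{13} \]

where $\propto$ is used to indicate that we should use the normalization
conditions (5) to guarantee that the beliefs sum up
to one. Applying the consistency constraint (6) to Eq. (13),
making summation over all spins but the given $\sigma_i$, and also
using Eq. (14), we derive the following BP equations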



(15) 



The right hand side of Eq. (15) rewritten for the LDPC case
(3) becomes



(16)



Thus, constructing $b_i(+)/b_i(-)$ for the LDPC case in two
different ways (corresponding to the left and right relations in
Eq. (15)), equating the results and introducing the $\eta_{i\alpha}$ field

\[ \exp(2\eta_{i\alpha}) = \cdots, \tag{17} \]

one arrives at the following BP equations for the $\eta_{i\alpha}$ fields:

\[ \eta_{i\alpha} = h_i + \sum_{\beta \ni i}^{\beta \ne \alpha} \tanh^{-1}\Bigl( \prod_{j \in \beta}^{j \ne i} \tanh \eta_{j\beta} \Bigr). \tag{18} \]

Iterative solution of this equation, corresponding to Eq. (1)
with $\Delta = +\infty$, is just the standard iterative BP (which can also
be called sum-product) used for the decoding of an LDPC
code.
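
A minimal sketch of this standard iteration is given below, assuming the usual sum-product form in which each message $\eta_{i\alpha}$ is $h_i$ plus the $\tanh^{-1}$ of a product of $\tanh$ over the other checks of bit $i$. The toy code (whose Tanner graph is a tree), the values in `h`, and the function names are hypothetical; on a tree the iteration converges in a few sweeps and reproduces the exact bit marginals, which the brute-force helper is there to confirm.

```python
import itertools
import math

CHECKS = [(0, 1, 2), (2, 3, 4)]   # hypothetical toy code whose Tanner graph is a tree
N_BITS = 5
h = [0.8, -0.3, 0.5, 1.1, 0.2]    # illustrative log-likelihood fields


def check_to_bit(eta, a, i):
    """atanh of the product of tanh(eta[j, a]) over the bits j != i of check a."""
    prod = math.prod(math.tanh(eta[(j, a)]) for j in CHECKS[a] if j != i)
    return math.atanh(prod)


def bp_step(eta):
    """One synchronous sweep: eta[i, a] <- h_i + sum over the other checks of bit i."""
    new = {}
    for (i, a) in eta:
        others = [b for b, c in enumerate(CHECKS) if i in c and b != a]
        new[(i, a)] = h[i] + sum(check_to_bit(eta, b, i) for b in others)
    return new


# Initialize every edge message with the channel value and iterate.
eta = {(i, a): h[i] for a, c in enumerate(CHECKS) for i in c}
for _ in range(10):
    eta = bp_step(eta)


def bp_magnetization(i):
    """b_i(+) - b_i(-) computed from the messages after the sweeps above."""
    checks_i = [a for a, c in enumerate(CHECKS) if i in c]
    return math.tanh(h[i] + sum(check_to_bit(eta, a, i) for a in checks_i))


def exact_magnetization(i):
    """Brute-force <sigma_i> over codewords weighted by exp(sum_j h_j sigma_j)."""
    num = den = 0.0
    for s in itertools.product((1, -1), repeat=N_BITS):
        if all(math.prod(s[j] for j in c) == 1 for c in CHECKS):
            wgt = math.exp(sum(hj * sj for hj, sj in zip(h, s)))
            num += s[i] * wgt
            den += wgt
    return num / den
```

On a graph with loops the same sweep need not converge, and when it does the resulting marginals are only approximate.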

A simplified min-sum version of Eq. (1) is

(19)

II. Termination curve for standard iterative BP

To illustrate the standard BP iterative decoding, given by
Eqs. (1), (19) with $\Delta = +\infty$, we consider the example of the
[155, 64, 20] code of Tanner [14] operating over the AWGN
channel, characterized by the transition probability
for a bit, $p(x|\sigma) = \exp(-s^2 (x - \sigma)^2 / 2) / \sqrt{2\pi / s^2}$, where
$\sigma$ and $x$ are the input and output values at a bit and $s^2$
is the SNR. Launching a fixed codeword into the channel,
emulating the channel by means of standard Monte Carlo
simulations and then decoding the channel output constitutes
our "experimental" procedure.
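
The channel-emulation step of this procedure can be sketched as follows, assuming the Gaussian transition probability $p(x|\sigma) \propto \exp(-s^2 (x-\sigma)^2/2)$, i.e. noise of standard deviation $1/s$, and the convention $p(\sigma|x) \propto \exp(h\sigma)$, which gives $h_i = s^2 x_i$. The function name, the seed and the value of `s` are illustrative choices, and the all-(+1) word stands in for the fixed codeword.

```python
import random


def awgn_log_likelihoods(codeword, s, rng):
    """Send spins (+1/-1) through the channel p(x|sigma) ~ exp(-s^2 (x-sigma)^2 / 2).

    The noise has standard deviation 1/s. Returns h_i = s^2 * x_i, which follows
    from ln[p(x|+1)/p(x|-1)] = 2 s^2 x under the convention p(sigma|x) ~ exp(h*sigma).
    """
    received = [sigma + rng.gauss(0.0, 1.0 / s) for sigma in codeword]
    return [s * s * x for x in received]


rng = random.Random(0)            # fixed seed so that the run is reproducible
codeword = [1] * 155              # all-(+1) word, i.e. the all-zero codeword
h = awgn_log_likelihoods(codeword, s=1.5, rng=rng)
# Bits whose channel value alone would be decoded incorrectly.
hard_decision_errors = sum(1 for hi in h if hi < 0)
```

Repeating this draw many times and feeding each `h` to the decoder yields the Monte Carlo estimate of the termination statistics; the all-(+1) word is a valid choice because it is a codeword of every linear code.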

We analyze the probability distribution function of the
iteration number at which the decoding terminates. The
termination probability curve for the standard min-sum,
described by Eq. (19) with $\Delta = +\infty$, is shown in Fig. 1
for SNR = 1, 2, 3.

[Figure 1: plot not reproduced; horizontal axis: number of iterations, with ticks from 1 to 16384 in powers of 4.]

Fig. 1. The termination probability curve for SNR = 1, 2, 3. Notice that
the probability of termination (successful decoding) without any iterations
is always finite. A few points on the right part of the plot correspond to the
case when the decoding was not terminated even at the maximum number
of iterations, 16384 (decoding fails to converge to a codeword).

The result of decoding is also verified at each iteration
step for compliance with a codeword; iteration is terminated
if a codeword is recovered. This termination strategy can still
give an error, although the probability of confusing the actual
codeword with a distant one is much less than the probability of not
recovering a codeword for many iterations. If one neglects the
very low probability of the codewords' confusion, then the
probability of still having a failure after $n_{\mathrm{it}}$ iterations is equal
to the integral/sum over the termination curve from $n_{\mathrm{it}}$ and
up. Note also that the probability that even an infinite number
of iterations will not result in a codeword can actually be
finite.
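
The relation between the termination curve and the failure probability can be made explicit with a short sketch. The function name and the numbers below are made up for illustration (they are not measured data), and the tail is taken strictly after iteration $n$, with the never-terminating probability added as a separate term.

```python
from itertools import accumulate


def failure_probability(termination_pmf, p_never):
    """P(no codeword found within n iterations) for n = 0, 1, ..., len(pmf) - 1.

    termination_pmf[n] is the probability that decoding terminates exactly at
    iteration n; p_never is the probability that it never terminates.
    """
    total = sum(termination_pmf) + p_never
    cumulative = list(accumulate(termination_pmf))
    return [total - c for c in cumulative]


# Made-up numbers for illustration only.
pmf = [0.10, 0.40, 0.30, 0.15]
tail = failure_probability(pmf, p_never=0.05)
```

With these numbers the failure probability falls from 0.9 after zero iterations to 0.05 after three, and it cannot drop below `p_never` however many iterations are allowed.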

Discussing Fig. 1, one observes two distinct features of the
termination probability curve. First, in all cases the curve
reaches its maximum at some relatively small number of
iterations. Second, each curve crosses over to an algebraic-like
decay which gets steeper as the SNR increases.

The emergence of an algebraically extended tail (that is,
a tail which does not decay fast) is not encouraging, as
it suggests that increasing the number of iterations will not
bring much of an improvement in the iterative procedure.
It also motivates us to look for possibilities of accelerating
convergence of the BP algorithm to a minimum of the Bethe
free energy.

Note also the wiggling of the termination curves for SNR =
2, 3 near the crossover point (see Fig. 1). It is possibly related
to the cycling of the BP dynamics (and thus the inability of
BP to converge).

III. Relaxation to a minimum of the Bethe free energy

The idea is to introduce relaxational dynamics (damping)
in an auxiliary time, $t$, thus enforcing convergence to a
minimum of the Bethe free energy. One chooses $b_i(\sigma_i)$ as
the main variational field and considers relaxing the variational
equations (12) according to

\[ \partial_t b_i(\sigma_i) = -\frac{1}{\tau_i} \frac{\delta L}{\delta b_i(\sigma_i)}, \tag{20} \]

while keeping the set of remaining variational equations
(5), (6), (11) intact. Here $L$ is the effective Lagrangian (10) and the
positive parameters $\tau_i$ have the
physical meaning of correlation/relaxation times. Performing
calculations that are completely equivalent to the ones
described in Section I, we arrive at the following modified
BP equations



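\[ \eta_{i\alpha} + (q_i - 1) Q_i = h_i + \sum_{\beta \ni i}^{\beta \ne \alpha} \tanh^{-1}\Bigl( \prod_{j \in \beta}^{j \ne i} \tanh \eta_{j\beta} \Bigr), \tag{21} \]

\[ Q_i = \tau_i \partial_t \tanh\Bigl[ \sum_{\alpha \ni i} \eta_{i\alpha} + (q_i - 1) Q_i \Bigr]. \tag{22} \]

We are interested in approaching (finding) a solution of the original
BP Eq. (18). One assumes $Q_i \ll \sum_{\alpha \ni i} \eta_{i\alpha}$, thus ignoring the
second term under $\tanh$ in Eq. (22). The resulting continuous
equation is

\[ \eta_{i\alpha} + \frac{(q_i - 1)\tau_i}{\cosh^2\bigl( \sum_{\beta \ni i} \eta_{i\beta} \bigr)} \, \partial_t \sum_{\beta \ni i} \eta_{i\beta}
   = h_i + \sum_{\beta \ni i}^{\beta \ne \alpha} \tanh^{-1}\Bigl( \prod_{j \in \beta}^{j \ne i} \tanh \eta_{j\beta} \Bigr). \tag{23} \]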

Eq. (1) represents a simple discretized version of Eq. (23),
where the correlation coefficients $\tau_i$ are chosen to make the
coefficient in front of the second term on the left hand side
of Eq. (23) independent of the bit index, $i$. Then the resulting
time dependent coefficient can be rescaled to one by an
appropriate choice of the temporal unit; $t_n$ is the uniform
discrete time, $n$ is a positive integer, $t_{n+1} - t_n = \Delta > 0$;
the left hand side (right hand side) of Eq. (23) is taken at
$t_{n+1}$ ($t_n$) and the temporal derivative is discretized in a
standard retarded way, $\partial_t \eta_{i\alpha} \to (\eta_{i\alpha}^{(n+1)} - \eta_{i\alpha}^{(n)})/\Delta$. This
choice of relaxation coefficients and discretization, resulting
in Eq. (1), was made for the simplicity of the final formula,
its realizability at all positive $\Delta$ and also its
equivalence to the standard iterative BP at $\Delta \to +\infty$.
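
The discretized scheme can be sketched in code under the following assumptions. The implicit update $\eta_{i\alpha}^{(n+1)} + \Delta^{-1}\sum_{\beta\ni i}(\eta_{i\beta}^{(n+1)} - \eta_{i\beta}^{(n)}) = G_{i\alpha}^{(n)}$, with $G_{i\alpha}$ the right-hand side of the standard iteration, is linear in the new messages of one bit and is solved in closed form. The min-sum branch uses the common sign-times-minimum approximation, which is assumed here to be the variant meant by Eq. (19). The toy code, the clipping constant and all names are hypothetical.

```python
import math

CHECKS = [(0, 1, 2), (2, 3, 4)]   # hypothetical toy code (tree-like Tanner graph)
BIT_CHECKS = {i: [a for a, c in enumerate(CHECKS) if i in c] for i in range(5)}


def check_to_bit(eta, a, i, min_sum=False):
    """Message from check a to bit i, built from eta[j, a] with j != i."""
    vals = [eta[(j, a)] for j in CHECKS[a] if j != i]
    if min_sum:  # sign-times-minimum approximation
        sign = math.prod(1 if v >= 0 else -1 for v in vals)
        return sign * min(abs(v) for v in vals)
    prod = math.prod(math.tanh(v) for v in vals)
    prod = max(-1.0 + 1e-12, min(1.0 - 1e-12, prod))  # keep atanh finite
    return math.atanh(prod)


def relaxed_step(eta, h, delta, min_sum=False):
    """One relaxed sweep; delta = math.inf gives the standard iteration."""
    new = {}
    for i, checks in BIT_CHECKS.items():
        u = {a: check_to_bit(eta, a, i, min_sum) for a in checks}
        g = {a: h[i] + sum(u[b] for b in checks if b != a) for a in checks}
        # Closed-form solution: eta_new = g + (sum(eta_old) - sum(g)) / (delta + q_i).
        corr = (sum(eta[(i, a)] for a in checks) - sum(g.values())) / (delta + len(checks))
        for a in checks:
            new[(i, a)] = g[a] + corr
    return new


def decode(h, delta, max_iter=200, min_sum=False):
    """Iterate until the hard decision satisfies all checks; return (spins, n_it)."""
    eta = {(i, a): h[i] for i, checks in BIT_CHECKS.items() for a in checks}
    for n_it in range(max_iter + 1):
        total = [h[i] + sum(check_to_bit(eta, a, i, min_sum) for a in BIT_CHECKS[i])
                 for i in range(5)]
        spins = [1 if t >= 0 else -1 for t in total]
        if all(math.prod(spins[j] for j in c) == 1 for c in CHECKS):
            return spins, n_it          # terminated on a codeword
        eta = relaxed_step(eta, h, delta, min_sum)
    return None, max_iter               # no codeword within max_iter iterations
```

For example, `decode([1.0, -0.2, 0.9, 1.2, 0.7], delta=1.0)` terminates on the all-(+1) word. Recording the returned `n_it` over many noise realizations gives an empirical termination curve for each value of `delta`. Since a fixed point of the relaxed sweep satisfies the standard equations, only the path to the fixed point depends on `delta`.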

IV. Modified iterative BP: test of performance 

We test the min-sum version (19) of the modified iterative
BP with Monte Carlo simulations of the [155, 64, 20]
code at a few values of the SNR. The resulting termination
probability curves are shown in Fig. 2 for SNR = 1, 2, 3.

The simulations show a shift of the probability curve
maximum to the right (towards a larger number of iterations)
as the damping parameter $\Delta$ decreases; however, once the
maximum is reached, the decay of the curve at a finite $\Delta$
is faster with the number of iterations than in the standard
BP case. The decay rate actually increases as $\Delta$ decreases.

Fig. 2. The termination probability curves for SNR = 1 (a), SNR = 2
(b), and SNR = 3 (c).

We conclude that at the largest $n_{\mathrm{it}}$ the performance of the
modified iterative BP is strictly better. However, to optimize
the modified iterative BP, thus aiming at better performance
than given by the standard iterative BP, one needs to account
for a trade-off: decreasing $\Delta$ leads to a faster
decay of the termination probability curve at the largest $n_{\mathrm{it}}$,
but it comes at the price of a larger
number of iterations necessary to reach the asymptotic
decay regime.

The last point is illustrated by Fig. 3, where the decoding
error probability depends non-monotonically on $\Delta$. One can
also see that the modification of BP can improve the
decoding performance; e.g., at SNR = 3 and a maximally
allowed $n_{\mathrm{it}} = 32$ (after which the decoding unconditionally
stops) the decoding error probability is reduced by a factor
of about 40 by choosing $\Delta = 1$ (see the bottom curve in
Fig. 3(b)).

Fig. 3. Decoding error probability as a function of $\Delta$ for SNR = 2 (a) and
SNR = 3 (b). Different curves correspond to different maximally allowed
$n_{\mathrm{it}}$: starting from $n_{\mathrm{it}} = 1$ (top curve) and increasing $n_{\mathrm{it}}$ by a factor of 2 with
each next lower curve. The points on the right correspond to the standard
BP ($\Delta = \infty$).

V. Conclusions and Discussions 

We presented a simple extension of the iterative BP which
allows (with proper optimization in the $(\Delta, n_{\mathrm{it}})$ plane) one to
guarantee not only an asymptotic convergence of BP to a
local minimum of the Bethe free energy but also a serious
gain in decoding performance at finite $n_{\mathrm{it}}$.

In addition to their own utility, these results should also be
useful for systematic improvement of the BP approximation.
Indeed, as was recently shown in [15], [16], the solution of the
BP equation can be used to express the full partition function
(or a-posteriori log-likelihoods calculated within MAP) in
terms of the so-called loop series, where each term is
associated with a generalized loop on the factor graph.
This loop calculus/series offers a remarkable opportunity
for constructing a sequence of efficient approximate and
systematically improvable algorithms. Thus we anticipate
that the improved iterative BP discussed in the present
manuscript will become an important building block in this
future approximate algorithm construction.

We already mentioned in the introduction that our algorithm
can be advantageous over other BP-based algorithms
converging to a minimum of the Bethe free energy mainly
due to its simplicity and tunability. In particular, the concave-convex
algorithms of [9], [10], as well as related linear
programming decoding algorithms [17], are formulated in
terms of beliefs. In contrast, our modification of the
iterative BP can be extensively simplified and stated in terms
of the smaller number of $\eta$ fields, each associated with an edge
of the factor graph rather than with the much bigger family of
local codewords. Thus in the case of a regular LDPC code
with $M$ checks of connectivity degree $k$ one finds that
the number of variables taken at each step of the iterative
procedure is $kM$ and $2^{k-1}M$ in our iterative scheme and
in the concave-convex scheme respectively. Having a tunable
correlation parameter $\tau$ in the problem is also advantageous
as it allows generalizations (e.g. by turning to an individual
bit-dependent relaxation rate). This flexibility is particularly
desirable in the degenerate case with multiple minima of the
Bethe free energy, as it allows a painless implementation
of annealing as well as other more sophisticated relaxation
techniques speeding up and/or improving convergence.
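
For a sense of scale, the two variable counts can be evaluated for the example code used in the simulations. This assumes that the [155, 64, 20] Tanner code is (3,5)-regular, so that it has $M = 155 \cdot 3/5 = 93$ checks of degree $k = 5$; that structural property is taken from the general literature on this code rather than from the text above.

```latex
\[
kM = 5 \times 93 = 465, \qquad 2^{k-1}M = 16 \times 93 = 1488 .
\]
```

Under this assumption the edge-based $\eta$ description uses roughly three times fewer variables per step, and the gap widens exponentially with the check degree $k$.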